Skip to content

Conversation

Techassi
Copy link
Member

Part of stackabletech/secret-operator#634 and https://github.com/stackabletech/decisions/issues/62.

This PR adds a new CRD maintainer which reconciles the CRDs when the conversion webhook server certificate is rotated. It additionally provides a oneshot channel to indicate that the initial CRD rollout is done and default custom resources can be applied.

This is a crude first implementation of the CRD and default CR maintenance mechanism. This PR will be used as a test-balloon in stackabletech/secret-operator#634.

In the future, this mechanism should be implemented in a more generic way, similar to kube::Controller.

@Techassi Techassi self-assigned this Sep 23, 2025
@Techassi Techassi moved this to Development: In Progress in Stackable Engineering Sep 23, 2025
@Techassi Techassi changed the title feat: Add CRD maintainer feat!: Add CRD maintainer Sep 23, 2025
@Techassi Techassi marked this pull request as ready for review September 24, 2025 09:08
@Techassi Techassi marked this pull request as draft September 24, 2025 09:36
@Techassi Techassi force-pushed the feat/crd-maintenance branch from 6c305d8 to 20656a6 Compare September 25, 2025 10:12
@Techassi Techassi force-pushed the feat/crd-maintenance branch from 20656a6 to 2fc5c28 Compare September 25, 2025 10:14
@Techassi Techassi marked this pull request as ready for review September 25, 2025 10:18
@Techassi Techassi moved this from Development: In Progress to Development: Waiting for Review in Stackable Engineering Sep 25, 2025
@sbernauer sbernauer moved this from Development: Waiting for Review to Development: In Review in Stackable Engineering Sep 30, 2025
Copy link
Member

@sbernauer sbernauer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any example code in secret-op I can look at?
At a first glance it looks like stackabletech/secret-operator#634 is not using it yet

@Techassi Techassi requested a review from sbernauer October 1, 2025 12:16
Copy link
Member

@sbernauer sbernauer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Techassi pushed his code, so we have a working example in secret-op.

What I don't like is that the webhook and the maintainer are not integrated into each other and we basically need to configure everything twice, which is annoying and error-prone:

pub async fn conversion_webhook(
    operator_environment: &OperatorEnvironmentOptions,
) -> anyhow::Result<(ConversionWebhookServer, mpsc::Receiver<Certificate>)> {
    let crds_and_handlers = [
        (
            SecretClass::merged_crd(SecretClassVersion::V1Alpha2)?,
            SecretClass::try_convert as fn(_) -> _,
        ),
        (
            TrustStore::merged_crd(TrustStoreVersion::V1Alpha1)?,
            TrustStore::try_convert as fn(_) -> _,
        ),
    ];

    let options = ConversionWebhookOptions {
        socket_addr: ConversionWebhookServer::DEFAULT_SOCKET_ADDRESS,
        namespace: operator_environment.operator_namespace.clone(),
        service_name: operator_environment.operator_service_name.clone(),
    };

    Ok(ConversionWebhookServer::new(crds_and_handlers, options).await?)
}

and further down

            let (maintainer, initial_reconcile_rx) = CustomResourceDefinitionMaintainer::new(
                client.as_kube_client(),
                certificate_rx,
                [
                    SecretClass::merged_crd(SecretClassVersion::V1Alpha2).unwrap(),
                    TrustStore::merged_crd(TrustStoreVersion::V1Alpha1).unwrap(),
                ],
                CustomResourceDefinitionMaintainerOptions {
                    operator_service_name: operator_environment.operator_service_name,
                    operator_namespace: operator_environment.operator_namespace,
                    field_manager: OPERATOR_NAME.to_owned(),
                    webhook_https_port: WebhookServer::DEFAULT_HTTPS_PORT,
                    disabled: maintenance.disable_crd_maintenance,
                },
            );

We duplicate:

  • The listen port
  • operator namespace
  • operator service name
  • The list of CRDS. What happens when I add a CRD to one but not hte other?
  • What stored version the CRDs should use. What happens if we use different onces?
  • We need to know that there is a channel of CA certs in between and need to wire it up

I'd say all of this can be gathered from the webhook.
I'm thinking of a API similar to this (one code place instead of two - keep main.rs clean!)

pub async fn conversion_webhook_and_crd_maintainer(
    client: Client,
    operator_environment: &OperatorEnvironmentOptions,
    maintenance: &MaintenanceOptions,
) -> anyhow::Result<(
    ConversionWebhookServer,
    CustomResourceDefinitionMaintainer,
    oneshot::Receiver<()>,
)> {
    let crds_and_handlers = [
        (
            SecretClass::merged_crd(SecretClassVersion::V1Alpha2)?,
            SecretClass::try_convert as fn(_) -> _,
        ),
        (
            TrustStore::merged_crd(TrustStoreVersion::V1Alpha1)?,
            TrustStore::try_convert as fn(_) -> _,
        ),
    ];

    let options = ConversionWebhookOptions {
        listen_addr: ConversionWebhookServer::DEFAULT_LISTEN_ADDRESS,
        listen_port: ConversionWebhookServer::DEFAULT_HTTPS_PORT,
        namespace: operator_environment.operator_namespace.clone(),
        service_name: operator_environment.operator_service_name.clone(),
    };

    let conversion_webhook = ConversionWebhookServer::new(crds_and_handlers, options).await?;
    let (crd_maintainer, initial_reconcile_rx) =
        CustomResourceDefinitionMaintainer::for_conversion_webhook(
            &conversion_webhook,
            client,
            OPERATOR_NAME.to_owned(),
            maintenance.disable_crd_maintenance,
        );

    Ok((conversion_webhook, crd_maintainer, initial_reconcile_rx))
}

We only specify everything once.

stackable-telemetry = { path = "../stackable-telemetry", optional = true, features = ["clap"] }
stackable-versioned = { path = "../stackable-versioned", optional = true }
stackable-webhook = { path = "../stackable-webhook", optional = true }
stackable-webhook = { path = "../stackable-webhook", optional = true, features = ["maintainer"]}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
stackable-webhook = { path = "../stackable-webhook", optional = true, features = ["maintainer"]}
stackable-webhook = { path = "../stackable-webhook", optional = true, features = ["maintainer"] }

Copy link
Member Author

@Techassi Techassi Oct 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This got removed in 99eae00.

@Techassi Techassi force-pushed the feat/crd-maintenance branch from f5a1841 to 99eae00 Compare October 8, 2025 14:46
@Techassi
Copy link
Member Author

Techassi commented Oct 8, 2025

I had similar ideas to what is mentioned in #1099 (review). Commit 99eae00 implements a simplified API to construct a webhook server in combination with a CRD maintainer.

A follow-up commit in stackabletech/secret-operator#634 will use the new code.

@Techassi Techassi requested a review from sbernauer October 8, 2025 14:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Development: In Review
Development

Successfully merging this pull request may close these issues.

2 participants